Amendments to the Specification: 


Please amend the title as follows: 
Distributed Data Clustering System And Method- 






Please amend paragraph 94 to correct the symbols as follows: 

[0094] [[=]] $\delta_k(x) = 1$ if $x$ is closest to $m_k$, otherwise [[=]] $\delta_k(x) = 0$ (resolve ties
arbitrarily). The summation of these functions over a data set (see (3) and (4)) residing on
the $l$-th unit gives the count, $n_{k,l}$, first moment, [[=]] $\Sigma_{k,l}$, and the second moment, $s_{k,l}$, of the
clusters. The vector $\{n_{k,l},$ [[=]] $\Sigma_{k,l},\ s_{k,l} \mid k = 1, \ldots, K\}$ has dimensionality 2*K+K*dim,
which is the size of the SS that have to be communicated between the Integrator and each
computing unit.
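
As an illustration of the statistics described in paragraph 94, the following minimal Python sketch computes the count, first moment, and second moment for the data held by one computing unit. The function and variable names are hypothetical, and the sketch assumes that the second moment is the sum of squared norms of the points assigned to each center, which the text does not spell out.

```python
from typing import List, Sequence, Tuple

Point = Sequence[float]
Stats = Tuple[List[int], List[List[float]], List[float]]


def squared_distance(a: Point, b: Point) -> float:
    """Squared Euclidean distance between two points."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def local_sufficient_statistics(data: Sequence[Point],
                                centers: Sequence[Point]) -> Stats:
    """Return (counts, first_moments, second_moments) for one unit.

    Assumption: second_moments[k] is the sum of ||x||^2 over the points
    whose nearest center is k.
    """
    K, dim = len(centers), len(centers[0])
    counts = [0] * K
    first = [[0.0] * dim for _ in range(K)]
    second = [0.0] * K
    for x in data:
        # delta_k(x) = 1 only for the nearest center; min() keeps the
        # lowest index on ties, which is one arbitrary tie-breaking rule.
        k = min(range(K), key=lambda j: squared_distance(x, centers[j]))
        counts[k] += 1
        for d in range(dim):
            first[k][d] += x[d]
        second[k] += sum(v * v for v in x)
    return counts, first, second


centers = [(0.0, 0.0), (10.0, 10.0)]
unit_data = [(0.0, 0.0), (1.0, 0.0), (10.0, 10.0)]
counts, first, second = local_sufficient_statistics(unit_data, centers)
print(counts, first, second)
# prints [2, 1] [[1.0, 0.0], [10.0, 10.0]] [1.0, 200.0]
```

With K = 2 and dim = 2, the unit sends 2*K + K*dim = 8 numbers, regardless of how many data points it holds.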



Please amend paragraph 95 to correct the symbols as follows: 

[0095] The set of SS presented here is more than sufficient for the simple version of K-Means
algorithm. The aggregated quantity, [[=]] $\sum_k s_{k,l}$, could be sent instead of the
individual $s_{k,l}$. But there are other variations of K-Means performance functions that
require individual $s_{k,l}$ for evaluating the performance functions. Besides, the quantities
that dominate the communication cost are [[=]] $\Sigma_{k,l}$.

Please amend paragraph 96 to correct the symbols and the equation. In
the equation, please note the distinction between the summation symbol and the
subscripted variable $\Sigma$.

[0096] The $l$-th computing unit collects the SS, $\{n_{k,l},$ [[=]] $\Sigma_{k,l},\ s_{k,l} \mid k = 1, \ldots, K\}$, on the
data in its own memory, and then sends them to the Integrator. The Integrator simply adds
up the SS from each unit to get the global SS,

$n_k = \sum_{l=1}^{L} n_{k,l}, \qquad \Sigma_k = \sum_{l=1}^{L} \Sigma_{k,l}, \qquad s_k = \sum_{l=1}^{L} s_{k,l}.$

Please amend paragraph 97 to correct the symbols as follows: 



[0097] The leading cost of integration is O(K·dim·L), where L is the number of computing
units. The new location of the $k$-th center is given by $m_k =$ [[=]] $\Sigma_k / n_k$ from the global SS
(this is the I( ) function in (2)), which is the only information all the computing units need
to start the next iteration. The performance function is calculated by (proof by direct
verification),

L 

p e r f KM = E^jfc • 
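
The integration step of paragraphs 96 and 97 can be sketched in Python as follows. This is a minimal illustration with hypothetical names, not the claimed implementation. It assumes each unit reports a tuple (counts, first moments, second moments) and that the second moment is a sum of squared norms. Under that assumption, the performance at the new centers follows from the identity $\sum_{x \in S_k} \|x - m_k\|^2 = s_k - n_k \|m_k\|^2$, which is derived here for illustration and is not quoted from the specification.

```python
from typing import List, Sequence, Tuple

Stats = Tuple[List[int], List[List[float]], List[float]]


def integrate(unit_stats: Sequence[Stats]) -> Stats:
    """Add up the per-unit SS to obtain the global SS (n_k, Sigma_k, s_k)."""
    K = len(unit_stats[0][0])
    dim = len(unit_stats[0][1][0])
    n = [0] * K
    sigma = [[0.0] * dim for _ in range(K)]
    s = [0.0] * K
    for counts, first, second in unit_stats:
        for k in range(K):
            n[k] += counts[k]
            s[k] += second[k]
            for d in range(dim):
                sigma[k][d] += first[k][d]
    return n, sigma, s


def new_centers(n: List[int], sigma: List[List[float]],
                old_centers: Sequence[Sequence[float]]) -> List[List[float]]:
    """m_k = Sigma_k / n_k; an empty cluster keeps its old center (assumption)."""
    return [
        [v / n[k] for v in sigma[k]] if n[k] > 0 else list(old_centers[k])
        for k in range(len(n))
    ]


def performance(n: List[int], s: List[float],
                centers: List[List[float]]) -> float:
    """Sum over k of s_k - n_k * ||m_k||^2, valid when m_k = Sigma_k / n_k."""
    return sum(
        s[k] - n[k] * sum(c * c for c in centers[k]) for k in range(len(n))
    )


# Two units, two centers, two dimensions (made-up statistics).
unit_a = ([2, 1], [[1.0, 0.0], [10.0, 10.0]], [1.0, 200.0])
unit_b = ([1, 1], [[2.0, 0.0], [12.0, 10.0]], [4.0, 244.0])
n, sigma, s = integrate([unit_a, unit_b])
centers = new_centers(n, sigma, [(0.0, 0.0), (10.0, 10.0)])
print(centers, performance(n, s, centers))
# prints [[1.0, 0.0], [11.0, 10.0]] 4.0
```

The loop in `integrate` touches K·dim numbers for each of the L units, which matches the O(K·dim·L) leading cost stated above.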

Please amend paragraph 102 to correct the equation. Note that the 
numerator in the first summation has changed from "1" to V. 



[0102] (K-Means is similar, except its weights are the nearest-center membership 
functions, making its centers centroids of the cluster.) Overall then, the recursion equation 
is given by 

1/^1 

Please amend paragraph 103 to correct the equation. Note that the index
for the summation is "k" rather than "A", and the denominators in the expression
for $g_2$ and $g_3$ are cubed "3" rather than taken to the s power "s".

[0103] where $d_{x,k} = \|x - m_k\|$ and s is a constant $\approx 4$. The decomposed functions for calculating
SS (see (3) and (4)) are then 
g 3 (x,M) = gf(x,M)'

Please amend paragraph 108 to correct the equation. Note that the
subscript for p is "k" rather than "A", and the symbol for summation must be
carefully distinguished from the subscripted variable $\Sigma$.



[0108] In this example, the EM algorithm with linear mixing of K bell-shape (Gaussian)
functions is described. Unlike K-Means and K-Harmonic Means in which only the centers
are to be estimated, the EM algorithm estimates the centers, the covariance matrices,
and the mixing probabilities, $p(m_k)$. The performance function of the EM algorithm is



$\mathrm{Perf}_{EM}(X, M, \Sigma, p) = -\log \prod_{x \in S} \left[ \sum_{k=1}^{K} p(x \mid m_k)\, p(m_k) \right]$

Reply to Notice of Allowance of July 29, 2005

Please amend paragraph 111 to correct the equation. Note that the
symbol for summation must be carefully distinguished from the subscripted
variable $\Sigma$.

[0111] where $p(x \mid m)$ is the prior probability with Gaussian distribution, and $p(m_k)$ is the
mixing probability.



$p(x \mid m_k) = \frac{1}{\sqrt{(2\pi)^D \det(\Sigma_k)}} \cdot \mathrm{EXP}\left( -(x - m_k)\, \Sigma_k^{-1}\, (x - m_k)^T \right)$    (15)



Please amend paragraph 112 to correct the equation. Note that the
subscript for m is "k" rather than "A", and the symbol for summation must be
carefully distinguished from the subscripted variable $\Sigma$.

[0112] M-Step: With the fuzzy membership function from the E-Step, find the new center 
locations, new co-variance matrices, and new mixing probabilities that maximize the 
performance function. 

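$m_k = \frac{\sum_{x \in S} p(m_k \mid x)\, x}{\sum_{x \in S} p(m_k \mid x)}$

$\Sigma_k = \frac{\sum_{x \in S} p(m_k \mid x)\, (x - m_k)^T (x - m_k)}{\sum_{x \in S} p(m_k \mid x)}$

$p(m_k) = \frac{1}{|S|} \sum_{x \in S} p(m_k \mid x)$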

Please amend paragraph 113 to correct the equation. Note that in the
expression for $f_1$, the subscript for m is "k" rather than "A".

[0113] The functions for calculating the SS are: 



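$f_1(x, M, \Sigma, p) = -\log \left[ \sum_{k=1}^{K} p(x \mid m_k)\, p(m_k) \right]$

$g_1(x, M, \Sigma, p) = \left( p(m_1 \mid x),\ p(m_2 \mid x),\ \ldots,\ p(m_K \mid x) \right)$
$g_2(x, M, \Sigma, p) = \left( p(m_1 \mid x)\, x,\ p(m_2 \mid x)\, x,\ \ldots,\ p(m_K \mid x)\, x \right)$
$g_3(x, M, \Sigma, p) = \left( p(m_1 \mid x)\, x^T x,\ p(m_2 \mid x)\, x^T x,\ \ldots,\ p(m_K \mid x)\, x^T x \right)$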
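
To make the role of these functions concrete, the following Python sketch evaluates them on one-dimensional data held by separate units and then performs the M-step from the summed statistics. All names are hypothetical. The sketch assumes the conventional normalized Gaussian density (with the usual factor of 1/2 in the exponent) and computes the membership $p(m_k \mid x)$ by Bayes' rule. It recovers the variance from the summed $g_3$ via $\Sigma_k = (\sum g_3)/(\sum g_1) - m_k^2$, an identity used here for illustration rather than taken from the specification.

```python
import math
from typing import List, Sequence, Tuple

EMStats = Tuple[float, List[float], List[float], List[float]]


def gaussian_pdf(x: float, mean: float, var: float) -> float:
    """Standard 1-D Gaussian density (assumed form, see lead-in)."""
    return math.exp(-((x - mean) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def local_em_statistics(data: Sequence[float], means: Sequence[float],
                        variances: Sequence[float],
                        mix: Sequence[float]) -> EMStats:
    """Sum f1, g1, g2, g3 over the data of one computing unit."""
    K = len(means)
    f1 = 0.0
    g1, g2, g3 = [0.0] * K, [0.0] * K, [0.0] * K
    for x in data:
        joint = [gaussian_pdf(x, means[k], variances[k]) * mix[k] for k in range(K)]
        total = sum(joint)
        f1 += -math.log(total)
        for k in range(K):
            post = joint[k] / total  # fuzzy membership p(m_k | x)
            g1[k] += post
            g2[k] += post * x
            g3[k] += post * x * x    # x^T x reduces to x*x in one dimension
    return f1, g1, g2, g3


def m_step(unit_stats: Sequence[EMStats], num_points: int):
    """Combine per-unit statistics and return (perf, means, variances, mix)."""
    K = len(unit_stats[0][1])
    perf = sum(u[0] for u in unit_stats)
    G1 = [sum(u[1][k] for u in unit_stats) for k in range(K)]
    G2 = [sum(u[2][k] for u in unit_stats) for k in range(K)]
    G3 = [sum(u[3][k] for u in unit_stats) for k in range(K)]
    means = [G2[k] / G1[k] for k in range(K)]
    variances = [G3[k] / G1[k] - means[k] ** 2 for k in range(K)]
    mix = [G1[k] / num_points for k in range(K)]
    return perf, means, variances, mix


# Made-up data split across two units, with made-up starting parameters.
unit_a, unit_b = [0.0, 0.2, 5.0], [-0.1, 4.8, 5.2]
params = ([0.0, 5.0], [1.0, 1.0], [0.5, 0.5])
stats = [local_em_statistics(u, *params) for u in (unit_a, unit_b)]
print(m_step(stats, len(unit_a) + len(unit_b)))
```

Because only sums are exchanged, running the same step on the pooled data gives the same parameters up to floating-point rounding.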



155254.01/216Z43BOO Page 6 Of 12 HP PDNO 10001360-1 

PAGE 8/14 * RCVD AT 7/19/2005 4:15:02 PM [Eastern Daylight Time] * SVR:USPTO-EFXRF-5/25 * DNIS:7464000 * CSID:7132388008 * DURATION (mm-ss):04-16



